PR #1028: TPC Model Implementation for ICU Length-of-Stay Prediction #1060

Closed
tarakjc2c wants to merge 22 commits into sunlabuiuc:master from tarakjc2c:pr-1028

tarakjc2c commented Apr 21, 2026

PR #1028: TPC Model Implementation for ICU Length-of-Stay Prediction

Contributors

Pankaj Meghani (meghani3), Tarak Jha (tarakj2), Pranash Krishnan (pranash2)

Summary

This PR implements the Temporal Pointwise Convolutional Networks (TPC) model from Rocheteau et al. (CHIL 2021) for ICU length-of-stay prediction in the PyHealth framework.

https://arxiv.org/abs/2007.09483

Implementation Overview

1. Model Implementation (pyhealth/models/tpc.py)

  • Complete TPC architecture with depthwise separable convolutions
  • Temporal convolutions with channel groups (per-feature processing)
  • Pointwise convolutions for cross-feature interactions
  • Temporal attention mechanism
  • Custom loss functions:
    • MSLELoss: Mean Squared Log Error for balanced predictions across skewed LoS distribution
    • MaskedMSELoss: Handles missing data (critical for ICU datasets with 90% missingness)
  • Novel extension: predict_with_uncertainty() method implementing Monte Carlo Dropout for clinical decision support
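
For illustration, the temporal and pointwise convolutions listed above can be sketched in PyTorch roughly as follows. This is a minimal sketch under stated assumptions, not the code in pyhealth/models/tpc.py: the class name `TemporalPointwiseBlock` and the parameters `temp_kernels` and `point_size` are hypothetical, and normalisation, dropout, and attention are left out.

```python
import torch
import torch.nn as nn


class TemporalPointwiseBlock(nn.Module):
    """Hypothetical block: per-feature temporal conv plus a pointwise (1x1) conv."""

    def __init__(self, n_features: int, temp_kernels: int = 4,
                 point_size: int = 8, kernel_size: int = 3):
        super().__init__()
        # Left-pad only, so each output step sees past values and length is preserved.
        self.pad = nn.ConstantPad1d((kernel_size - 1, 0), 0.0)
        # groups=n_features gives every feature its own filters (no cross-feature mixing).
        self.temporal = nn.Conv1d(
            n_features, n_features * temp_kernels,
            kernel_size=kernel_size, groups=n_features,
        )
        # kernel_size=1 mixes features within a single time step.
        self.pointwise = nn.Conv1d(n_features, point_size, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, n_features, time).
        temp_out = torch.relu(self.temporal(self.pad(x)))
        point_out = torch.relu(self.pointwise(x))
        # Concatenate both views along the channel dimension.
        return torch.cat([temp_out, point_out], dim=1)


x = torch.randn(2, 34, 48)  # 2 stays, 34 features, 48 hourly steps
block = TemporalPointwiseBlock(n_features=34)
out = block(x)
print(out.shape)  # torch.Size([2, 144, 48])
```

With `groups=n_features`, the temporal weight tensor has one input channel per filter, which is what "per-feature processing" amounts to in this sketch; cross-feature interaction comes only from the pointwise branch.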

2. Data Pipeline (pyhealth/tasks/length_of_stay_tpc_mimic4.py)

  • MIMIC-IV preprocessing for 34 time-varying vitals and labs
  • Hourly temporal binning with proper masking
  • Integration of diagnosis codes (ICD-9/ICD-10)
  • Handles irregular sampling and missing data
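
The hourly binning and masking described above might look like the following framework-free sketch. The function name `bin_hourly`, the event tuple layout, and the rule of keeping the latest measurement in each hour are assumptions made for illustration; the task file in this PR may resolve collisions differently.

```python
import math


def bin_hourly(events, n_hours, n_features):
    """Bin irregular events into an hourly grid with an observation mask.

    events: iterable of (hours_since_admission, feature_index, value).
    Returns (values, mask), each a list of n_hours rows of n_features entries.
    """
    values = [[0.0] * n_features for _ in range(n_hours)]
    mask = [[0] * n_features for _ in range(n_hours)]
    latest = {}  # (hour, feature) -> timestamp of the measurement kept so far
    for t, feature, value in events:
        hour = math.floor(t)
        if not 0 <= hour < n_hours:
            continue  # outside the observation window
        key = (hour, feature)
        # Assumption: when several measurements share a bin, keep the latest one.
        if key not in latest or t >= latest[key]:
            latest[key] = t
            values[hour][feature] = value
            mask[hour][feature] = 1
    return values, mask


events = [(0.2, 0, 80.0), (0.9, 0, 84.0), (2.5, 1, 7.4), (5.0, 0, 90.0)]
values, mask = bin_hourly(events, n_hours=3, n_features=2)
print(values)  # [[84.0, 0.0], [0.0, 0.0], [0.0, 7.4]]
print(mask)    # [[1, 0], [0, 0], [0, 1]]
```

The mask lets a downstream loss or model distinguish a true zero from an hour in which nothing was measured.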

3. Comprehensive Testing (tests/core/test_tpc.py)

  • 12/12 unit tests passing
  • Tests cover:
    • Model initialization (4 configurations: baseline, shallow, mse_loss, low_dropout)
    • Forward pass with correct output shapes
    • Both loss functions (MSLE and MSE)
    • Backward pass (gradient computation)
    • MC Dropout uncertainty estimation
    • Edge cases (batch size = 1, BatchNorm behavior)
  • No regressions: All existing PyHealth tests still pass

4. Complete Documentation

  • API documentation: docs/api/models/pyhealth.models.tpc.rst
  • Task documentation: docs/api/tasks/pyhealth.tasks.length_of_stay_tpc_mimic4.rst
  • Updated indices: docs/api/models.rst, docs/api/tasks.rst
  • Integrated with PyHealth documentation build system

5. Example Script

  • examples/length_of_stay/length_of_stay_mimic4_tpc.py
  • Full ablation study with 4 configurations
  • Requires a high-RAM environment to run on the full MIMIC-IV data (see Computational Constraints)

Key Features

Architecture Innovations

  • Depthwise separable convolutions adapted for multivariate time series
  • Channel groups = number of features: Each vital sign processed independently
  • MSLE loss in log-space: Handles right-skewed ICU length-of-stay distribution (median: 46.9 hrs, mean: 93.7 hrs)
  • 3 layers, 45% dropout (optimal configuration from paper's ablation studies)
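
To make the log-space argument concrete, here is a small standard-library sketch of a masked mean squared log error. It uses the common log(1 + x) convention, which is an assumption; the `MSLELoss` class in this PR may clamp or offset its inputs differently, and the name `masked_msle` is hypothetical.

```python
import math


def masked_msle(preds, targets, mask):
    """Mean of (log1p(pred) - log1p(target))^2 over positions where mask is truthy."""
    total, count = 0.0, 0
    for pred, target, observed in zip(preds, targets, mask):
        if observed:
            total += (math.log1p(pred) - math.log1p(target)) ** 2
            count += 1
    # Return 0.0 when nothing is observed, to avoid dividing by zero.
    return total / count if count else 0.0


# The same 1-day absolute error, on a short stay and on a long stay.
short_stay = masked_msle([2.0], [1.0], [1])
long_stay = masked_msle([11.0], [10.0], [1])
print(short_stay > long_stay)  # True
```

Under plain MSE both errors would cost the same; in log space the miss on the short stay is penalised more heavily, so the long tail of the distribution does not dominate training.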

Novel Extension: Monte Carlo Dropout Uncertainty

  • Provides prediction confidence intervals for clinical decision support
  • Implements Gal & Ghahramani (ICML 2016) methodology
  • Enables risk-stratified patient management
  • Tested and validated in unit tests
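
As a rough sketch of the Monte Carlo Dropout idea, the helper below keeps dropout active at inference time and summarises repeated stochastic forward passes. The function `mc_dropout_predict` and the toy model are hypothetical and do not reproduce the signature of `predict_with_uncertainty()` in this PR; the sketch also assumes the model uses `nn.Dropout` layers specifically.

```python
import torch
import torch.nn as nn


def mc_dropout_predict(model: nn.Module, x: torch.Tensor, n_samples: int = 20):
    """Return (mean, std) over n_samples stochastic forward passes."""
    was_training = model.training
    model.eval()  # BatchNorm and similar layers use their running statistics
    for module in model.modules():
        if isinstance(module, nn.Dropout):
            module.train()  # re-enable only the dropout layers
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    model.train(was_training)  # restore the caller's mode on every submodule
    return samples.mean(dim=0), samples.std(dim=0)


# Toy regressor standing in for the real model.
model = nn.Sequential(nn.Linear(34, 16), nn.ReLU(), nn.Dropout(0.45), nn.Linear(16, 1))
model.eval()
x = torch.randn(5, 34)
mean, std = mc_dropout_predict(model, x)
print(mean.shape, std.shape)
```

Switching only the dropout layers back to training mode matters when the network also contains BatchNorm, since calling `model.train()` on the whole model would make batch statistics depend on the inference batch.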

Testing Results

$ pytest tests/core/test_tpc.py -v
========================== 12 passed in 17.93s ==========================

Test Coverage:

  • Initialization: All 4 configurations (baseline, shallow, mse_loss, low_dropout)
  • Forward pass: Correct shapes accounting for time_before_pred offset
  • Loss computation: Both MSLE and Masked MSE
  • Backward pass: Gradients computed correctly
  • MC Dropout: 20 samples, mean + std outputs
  • Edge cases: BatchNorm with batch_size=1, short sequences

Reproduction Notes

What We Implemented

  • Complete end-to-end pipeline matching paper's architecture
  • All components tested and validated

Computational Constraints

The full ablation study requires more than 8GB of RAM for MIMIC-IV dataset processing (the commit notes recommend 16GB+): chartevents.csv.gz expands from 3.5GB compressed to ~15GB in memory. Our development machines (8GB RAM) consistently hit MemoryError.
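
One possible way around the memory limit, sketched here under assumptions and not part of this PR, is to stream the compressed file row by row and keep only the items of interest instead of loading the whole table. The helper name `iter_chartevents` is hypothetical, the column names follow the public MIMIC-IV chartevents schema, and the item IDs in the demo are purely illustrative.

```python
import csv
import gzip
import os
import tempfile


def iter_chartevents(path, wanted_itemids):
    """Yield (stay_id, charttime, itemid, valuenum) one row at a time."""
    with gzip.open(path, "rt", newline="") as fh:
        for row in csv.DictReader(fh):
            # Skip rows without a numeric value and items we do not model.
            if row["valuenum"] and int(row["itemid"]) in wanted_itemids:
                yield (row["stay_id"], row["charttime"],
                       int(row["itemid"]), float(row["valuenum"]))


# Tiny stand-in file so the sketch runs without MIMIC-IV access.
demo_path = os.path.join(tempfile.mkdtemp(), "chartevents.csv.gz")
with gzip.open(demo_path, "wt", newline="") as fh:
    writer = csv.writer(fh)
    writer.writerow(["stay_id", "charttime", "itemid", "valuenum"])
    writer.writerow(["1", "2150-01-01 00:10:00", "220045", "80"])
    writer.writerow(["1", "2150-01-01 00:20:00", "999999", "1"])
    writer.writerow(["1", "2150-01-01 00:30:00", "220045", ""])

rows = list(iter_chartevents(demo_path, wanted_itemids={220045}))
print(rows)  # [('1', '2150-01-01 00:10:00', 220045, 80.0)]
```

Because the generator holds only one row at a time, peak memory is governed by whatever the caller accumulates, not by the size of the decompressed file.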

Per instructor guidance, we instead validated all functionality through unit tests and synthetic data demonstrations. The implementation has not been run end to end on the full MIMIC-IV dataset; the example script is provided for environments with adequate memory.

Validation Approach

  • 12 unit tests exercise every component (initialization, forward and backward passes, losses, MC Dropout, edge cases)
  • 4 ablation configurations validated (initialization, forward/backward passes)
  • Both loss functions compute correctly
  • MC Dropout extension produces mean predictions + uncertainty estimates
  • Model is trainable (gradients flow correctly)

Paper Reference

Title: Temporal Pointwise Convolutional Networks for Length of Stay Prediction in the Intensive Care Unit

Authors: Emma Rocheteau, Pietro Liò, Stephanie Hyland

Venue: CHIL 2021 (Conference on Health, Inference, and Learning)

Paper: Proceedings of Machine Learning Research, vol 149, pages 58-68

Results (MIMIC-IV), as quoted by the PR authors and not reproduced in this PR:

  • TPC: 1.63 days MAE (best)
  • LSTM: 1.88 days MAE
  • Transformer: 1.79 days MAE
  • Standard CNN: 2.01 days MAE

Files Changed

Core Implementation:

  • pyhealth/models/tpc.py - Model implementation (505 lines)
  • pyhealth/models/__init__.py - Model registration
  • pyhealth/tasks/length_of_stay_tpc_mimic4.py - Data pipeline
  • tests/core/test_tpc.py - Comprehensive tests (12 tests)

Documentation:

  • docs/api/models/pyhealth.models.tpc.rst
  • docs/api/tasks/pyhealth.tasks.length_of_stay_tpc_mimic4.rst
  • docs/api/models.rst, docs/api/tasks.rst - Index updates

Examples & Resources:

  • examples/length_of_stay/length_of_stay_mimic4_tpc.py
  • test-resources/core/mimic4demo/icu/d_items.csv.gz
  • test-resources/core/mimic4demo/icu/chartevents.csv

Supporting:

  • pyhealth/datasets/configs/mimic4_ehr.yaml - Chartevents support

ParadoxicalNerd and others added 22 commits April 4, 2026 15:54
- Implement TPC (Temporal Pointwise Convolutional) model for length-of-stay prediction
- Add RemainingLOSMIMIC4 task for MIMIC-IV dataset
- Create 12 comprehensive unit tests (all passing)
- Add complete API documentation (RST files)
- Include ablation study script with 4 configurations + MC Dropout
- Fix: dtype bug in tpc.py line 519, BatchNorm edge cases
- All existing tests pass (no regressions)

Note: Ablation study requires 16GB+ RAM due to large MIMIC-IV chartevents (3.5GB).
Groupmates with adequate resources can run: examples/length_of_stay/length_of_stay_mimic4_tpc.py
- Synthetic MIMIC-IV data generation (300 patients, 34 features)
- Complete training pipeline with PyHealth integration
- Ablation study on the synthetic data: baseline (2.727 days MAE), shallow (2.506 days MAE), high_dropout (2.750 days MAE)
- Best configuration: shallow_network (1-layer TPC)
- Demonstrates: dataset creation, model training, evaluation, results export
- Extra credit: 10 points
tarakjc2c closed this Apr 22, 2026